FM5323 Data Science and Machine Learning in Finance
Project 2: Volatility Forecasting
Author
Po-Chin Huang
1 Part 0: Preparation
1.1 Import libraries
Code
import numpy as np
import pandas as pd
import yfinance as yf
yf.pdr_override()
from pandas_datareader import data as pdr
from arch import arch_model
import sys
import matplotlib.pyplot as plt
from sklearn.metrics import mean_absolute_error
from sklearn.linear_model import LinearRegression
Implement all six estimators (close-to-close, Parkinson, Garman-Klass, Rogers-Satchell, Yang-Zhang, and GARCH(1, 1)) in code.
For the backtest period starting from 1/1/2005 and ending on 4/2/2016, calculate the regression coefficient and \(R^2\) metric for weekly forecasts for each of the estimators, and compare your results to Sepp’s (pg. 41). Do this ONLY for SPY.
2.1 Close to Close
This is a simple volatility estimation method based on daily returns. It typically calculates the standard deviation of daily returns and annualizes it (multiply by \(\sqrt{252}\)) to measure the volatility of an asset.
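The description above can be sketched in a few lines. This is a minimal illustration rather than the notebook's actual code: the function name `close_to_close_vol` is hypothetical, log returns are assumed (the text only says "daily returns"), and the sample prices are made up.

```python
import numpy as np

def close_to_close_vol(close, trading_days=252):
    """Annualized close-to-close volatility from a sequence of closing prices."""
    close = np.asarray(close, dtype=float)
    log_returns = np.diff(np.log(close))      # daily log returns (assumption)
    daily_std = np.std(log_returns, ddof=1)   # sample standard deviation
    return daily_std * np.sqrt(trading_days)  # annualize with sqrt(252)

# Illustrative prices only, not market data.
prices = [100.0, 101.0, 99.5, 100.5, 102.0, 101.2]
print(close_to_close_vol(prices))
```

In a rolling setting the same calculation would be applied to each window of closing prices.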
Code
from sklearn.linear_model import LinearRegression
mdl_reg = LinearRegression(fit_intercept=True)
Close to Close
R^2 : 0.4093645253435927
R^2 Sepp : 0.46
Intercept : 0.049333840950531976
Intercept Sepp: 0.058
Slope: : [0.57068844]
Slope Sepp : 0.48
Bias : 0.01708041424884867
Efficiency : 0.8919571135609284
The regression R² indicates that approximately 40.94% of the variability in the dependent variable is explained by the independent variable. The intercept of 0.0493 is the expected value of the dependent variable when the independent variable is zero, while the slope of 0.5707 implies that a one-unit increase in the independent variable is associated with a 0.5707-unit rise in the dependent variable.
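To make the three reported quantities concrete, here is a small sketch of how an intercept, slope, and R² can be obtained from a forecast series and a realized series. It assumes the forecast is the regressor and the realized volatility is the response, uses `np.polyfit` instead of the `LinearRegression` object used in the notebook, and the function name and numbers are hypothetical.

```python
import numpy as np

def forecast_regression(forecast, realized):
    """OLS of realized volatility on forecast volatility: (intercept, slope, R^2)."""
    forecast = np.asarray(forecast, dtype=float)
    realized = np.asarray(realized, dtype=float)
    slope, intercept = np.polyfit(forecast, realized, deg=1)  # highest power first
    fitted = intercept + slope * forecast
    ss_res = np.sum((realized - fitted) ** 2)           # unexplained variation
    ss_tot = np.sum((realized - realized.mean()) ** 2)  # total variation
    return intercept, slope, 1.0 - ss_res / ss_tot

# Made-up weekly values for illustration only.
forecast = np.array([0.10, 0.15, 0.12, 0.30, 0.22, 0.18])
realized = np.array([0.11, 0.13, 0.12, 0.24, 0.17, 0.16])
print(forecast_regression(forecast, realized))
```

A slope below 1 together with a positive intercept means that high forecasts tend to be followed by realized volatility lower than forecast, and low forecasts by realized volatility somewhat higher.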
2.2 Parkinson
The Parkinson volatility model is based on the difference between the daily high and low prices, using these extreme prices to calculate volatility. This method is often more accurate than the Close-to-Close method because it considers intraday price movements. Its calculation formula uses the ratio between high and low prices and needs to be annualized based on trading days.
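As a rough sketch of the calculation described above, the standard Parkinson formula averages the squared log high/low ratio, scales it by \(1/(4\ln 2)\), and annualizes. The function name `parkinson_vol` and the sample prices are hypothetical, and 252 trading days per year is assumed as in the close-to-close section.

```python
import numpy as np

def parkinson_vol(high, low, trading_days=252):
    """Annualized Parkinson volatility from daily high and low prices."""
    high = np.asarray(high, dtype=float)
    low = np.asarray(low, dtype=float)
    log_hl_sq = np.log(high / low) ** 2                   # squared log range per day
    daily_var = np.mean(log_hl_sq) / (4.0 * np.log(2.0))  # Parkinson scaling
    return np.sqrt(daily_var * trading_days)              # annualize

# Illustrative prices only.
print(parkinson_vol(high=[101.0, 102.5, 101.8], low=[99.5, 100.9, 100.2]))
```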
The regression R² indicates that approximately 61.81% of the variability in the dependent variable is explained by the independent variable. The intercept of 0.0138 is the expected value of the dependent variable when the independent variable is zero, while the slope of 0.4467 means that a one-unit increase in the independent variable is associated with a 0.4467-unit rise in the dependent variable.
2.3 Garman Klass
The Garman-Klass volatility estimation takes into account the opening, high, low, and closing prices. This method captures price movements more comprehensively and is generally considered more accurate than methods using only closing prices.
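A minimal sketch of the standard Garman-Klass formula follows. It is not taken from the notebook: the function name `garman_klass_vol` and the sample bars are hypothetical, and the common form \(0.5\,\ln(H/L)^2 - (2\ln 2 - 1)\,\ln(C/O)^2\) is assumed.

```python
import numpy as np

def garman_klass_vol(open_, high, low, close, trading_days=252):
    """Annualized Garman-Klass volatility from daily OHLC prices."""
    o, h, l, c = (np.asarray(x, dtype=float) for x in (open_, high, low, close))
    range_term = 0.5 * np.log(h / l) ** 2                       # high/low range
    body_term = (2.0 * np.log(2.0) - 1.0) * np.log(c / o) ** 2  # open-to-close move
    daily_var = np.mean(range_term - body_term)
    return np.sqrt(daily_var * trading_days)

# Illustrative bars only.
print(garman_klass_vol(
    open_=[100.0, 100.8, 101.5],
    high=[101.0, 102.5, 101.8],
    low=[99.5, 100.4, 100.2],
    close=[100.8, 101.6, 100.5],
))
```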
The regression R² indicates that approximately 61.3% of the variability in the dependent variable is explained by the independent variable. The intercept of 0.0147 is the expected value of the dependent variable when the independent variable is zero, while the slope of 0.9103 means that a one-unit increase in the independent variable is associated with a 0.9103-unit rise in the dependent variable.
2.4 Rogers Satchell
The Rogers-Satchell volatility model also uses opening, high, low, and closing prices, similar to Garman-Klass. This method emphasizes the variation between different prices in capturing price volatility, potentially providing a more precise measure of volatility.
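The standard Rogers-Satchell formula can be sketched as follows. The function name `rogers_satchell_vol` and the sample bars are hypothetical. One property worth seeing in code is that a bar that trends straight from its low to its high contributes zero variance, which is why this estimator is usually described as insensitive to drift.

```python
import numpy as np

def rogers_satchell_vol(open_, high, low, close, trading_days=252):
    """Annualized Rogers-Satchell volatility from daily OHLC prices."""
    o, h, l, c = (np.asarray(x, dtype=float) for x in (open_, high, low, close))
    daily_terms = (np.log(h / c) * np.log(h / o)
                   + np.log(l / c) * np.log(l / o))
    return np.sqrt(np.mean(daily_terms) * trading_days)

# Illustrative bars only.
print(rogers_satchell_vol(
    open_=[100.0, 100.8, 101.5],
    high=[101.0, 102.5, 101.8],
    low=[99.5, 100.4, 100.2],
    close=[100.8, 101.6, 100.5],
))

# A pure trend day (open = low, close = high) contributes no variance.
print(rogers_satchell_vol([100.0], [110.0], [100.0], [110.0]))
```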
The regression R² indicates that approximately 59.42% of the variability in the dependent variable is explained by the independent variable. The intercept of 0.0203 is the expected value of the dependent variable when the independent variable is zero, while the slope of 0.8614 means that a one-unit increase in the independent variable is associated with a 0.8614-unit rise in the dependent variable.
2.5 Yang Zhang
The Yang-Zhang estimator combines the overnight (close-to-open) variance with a weighted average of the Rogers-Satchell variance and the open-to-close variance. Because it assumes continuous prices, the measure tends to slightly underestimate the volatility.
2.5.1 Function
\(\sigma_{YZ} = \sqrt{F} \sqrt{\sigma_{overnight \, volatility}^2 + k\sigma_{open \, to \, close \, volatility}^2 + (1-k)\sigma_{RS}^2}\)
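The formula above translates into code roughly as follows. This is a sketch under stated assumptions: \(F = 252\), the weight \(k = 0.34 / (1.34 + (n+1)/(n-1))\) is the value commonly used for Yang-Zhang (the report does not state which \(k\) was used), and the function name `yang_zhang_vol` is hypothetical. At least three bars are needed so that the sample variances are defined.

```python
import numpy as np

def yang_zhang_vol(open_, high, low, close, trading_days=252):
    """Annualized Yang-Zhang volatility from daily OHLC prices (needs >= 3 bars)."""
    o, h, l, c = (np.asarray(x, dtype=float) for x in (open_, high, low, close))
    overnight = np.log(o[1:] / c[:-1])   # close-to-open log returns
    open_close = np.log(c[1:] / o[1:])   # open-to-close log returns
    rs_terms = (np.log(h[1:] / c[1:]) * np.log(h[1:] / o[1:])
                + np.log(l[1:] / c[1:]) * np.log(l[1:] / o[1:]))
    n = len(overnight)
    k = 0.34 / (1.34 + (n + 1) / (n - 1))        # assumed weighting constant
    var_overnight = np.var(overnight, ddof=1)    # sample variance
    var_open_close = np.var(open_close, ddof=1)  # sample variance
    var_rs = np.mean(rs_terms)                   # Rogers-Satchell variance
    daily_var = var_overnight + k * var_open_close + (1.0 - k) * var_rs
    return np.sqrt(trading_days * daily_var)

# Illustrative bars only.
print(yang_zhang_vol(
    open_=[100.0, 100.9, 101.4, 100.7],
    high=[101.0, 102.5, 101.8, 101.9],
    low=[99.5, 100.4, 100.2, 100.1],
    close=[100.8, 101.6, 100.5, 101.5],
))
```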
The regression R² indicates that approximately 58.63% of the variability in the dependent variable is explained by the independent variable. The intercept of 0.0192 is the expected value of the dependent variable when the independent variable is zero, while the slope of 0.7599 means that a one-unit increase in the independent variable is associated with a 0.7599-unit rise in the dependent variable.
2.6 Garch
The GARCH (Generalized Autoregressive Conditional Heteroskedasticity) model is a popular time series model used to model and predict volatility in financial markets. GARCH(1, 1) means that the conditional variance depends on one lag of the squared error (the ARCH term) and one lag of the conditional variance itself (the GARCH term). By taking past errors and past variance into account, the model captures heteroskedasticity, which makes it suitable for forecasting volatility in financial time series.
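The recursion behind GARCH(1, 1) is \(\sigma_t^2 = \omega + \alpha\,\varepsilon_{t-1}^2 + \beta\,\sigma_{t-1}^2\). The sketch below only iterates this recursion for given parameters; it does not estimate them. The parameter values are hypothetical and chosen for illustration, returns are assumed to have zero mean so that \(\varepsilon_t = r_t\), and the function name `garch11_variance` is made up. In the notebook the parameters would come from fitting the model with the `arch` package.

```python
import numpy as np

def garch11_variance(returns, omega, alpha, beta):
    """Conditional variance path, ending with the one-step-ahead forecast."""
    returns = np.asarray(returns, dtype=float)
    sigma2 = np.empty(len(returns) + 1)
    sigma2[0] = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    for t in range(len(returns)):
        # one lag of the squared error plus one lag of the variance
        sigma2[t + 1] = omega + alpha * returns[t] ** 2 + beta * sigma2[t]
    return sigma2

# Hypothetical parameters and returns, not fitted values from this report.
omega, alpha, beta = 1e-6, 0.08, 0.90
rets = np.array([0.001, -0.012, 0.004, 0.020, -0.003])
path = garch11_variance(rets, omega, alpha, beta)
print(np.sqrt(252 * path[-1]))  # annualized one-step-ahead volatility forecast
```

The last element of `path` is the variance forecast for the day after the final observed return; the larger the most recent squared return, the higher that forecast.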
The regression R² indicates that approximately 56.33% of the variability in the dependent variable is explained by the independent variable. The intercept of -0.0103 is the expected value of the dependent variable when the independent variable is zero, while the slope of 0.6167 means that a one-unit increase in the independent variable is associated with a 0.6167-unit rise in the dependent variable.
3 Part 2: Backtest
This backtest will include 10 ETFs in etf_universe.csv of your choosing. Your analysis should include the following:
You will determine the length of the backtest based on data availability, but it should be at least five years. It should go up until the present.
You will use Sepp’s \(R^2\) metric, and another metric of your choosing (or creation). Give a justification for your metric, and why it makes sense.
For each underlying, the garch will probably take a few minutes to run, so I would suggest getting started on this first to make sure you have the data that you need. Make sure to save your garch forecasts in a CSV, so the grader doesn’t have to run code that takes a long time to finish.
Your final conclusion will be your choice of forecast methodology if you were making investment decisions based on these forecasts. Justify your answer with your analysis.
Your analysis should include readable prose, and at least 1-2 visualizations. You will be graded on the neatness and communication quality of your notebook and report.
Include all your code in a Jupyter Notebook and Quarto Document report.
3.1 ETF Selection
The backtesting period runs from 2014 to September 2024. Because I want roughly ten years of history before the backtest starts for GARCH training, I selected the following 10 ETFs, all of which have data available from 2003.
3.3 Main Computation
Declare R_squares, etf_volatility_dict, prediction, and etf_MAE outside the for loop to store the R² values, volatilities, predictions, and mean absolute errors (MAE), respectively.
Inside the for loop, all 6 estimators are calculated for each ETF.
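The bookkeeping described above might look like the following sketch. Only the four container names come from this report; the tickers, the two-estimator subset, the synthetic data, and the exact contents stored under each key are assumptions made for illustration.

```python
import numpy as np

tickers = ["AAA", "BBB"]                      # hypothetical tickers
estimators = ["close_to_close", "parkinson"]  # subset of the 6, for brevity

# Containers declared once, outside the loop, keyed by ticker then estimator.
R_squares, etf_volatility_dict, prediction, etf_MAE = {}, {}, {}, {}

rng = np.random.default_rng(0)
for ticker in tickers:
    realized = rng.uniform(0.10, 0.30, size=50)  # synthetic realized volatility
    R_squares[ticker], etf_volatility_dict[ticker] = {}, {}
    prediction[ticker], etf_MAE[ticker] = {}, {}
    for name in estimators:
        forecast = realized + rng.normal(0.0, 0.02, size=50)  # synthetic estimator
        slope, intercept = np.polyfit(forecast, realized, 1)
        fitted = intercept + slope * forecast
        ss_res = np.sum((realized - fitted) ** 2)
        ss_tot = np.sum((realized - realized.mean()) ** 2)
        etf_volatility_dict[ticker][name] = forecast
        prediction[ticker][name] = fitted
        R_squares[ticker][name] = 1.0 - ss_res / ss_tot
        etf_MAE[ticker][name] = np.mean(np.abs(realized - forecast))

print(R_squares)
print(etf_MAE)
```

Keeping the results in nested dictionaries makes it straightforward to turn them into one summary table per metric after the loop.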
I compared each of the 6 estimators with the realized volatility separately, because it would be visually difficult to distinguish all the estimators in the same plot. Below are 10 sets of plots, one per ETF, each comparing the realized volatility with the 6 estimators.
Mean absolute error (MAE) is a common metric for evaluating the accuracy of regression models. It measures the average absolute difference between predicted and actual values, which indicates how close the prediction is to the actual outcome.
\(\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left| y_i - \hat{y}_i \right|\)
\(\left| y_i - \hat{y}_i \right|\) : Absolute difference between the actual and predicted values for each observation.
Since MAE measures absolute differences, it directly reflects the average error without exaggerating larger errors, so it is useful when all observations should carry equal weight. Unlike mean squared error (MSE), which squares each error, MAE uses the absolute value of each error, which reduces the impact of extreme values or outliers.
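A small numerical example illustrates the difference in outlier sensitivity. The numbers are made up, and the helper functions `mae` and `mse` are written with NumPy for self-containment (the notebook imports `mean_absolute_error` from scikit-learn for the same purpose).

```python
import numpy as np

def mae(actual, predicted):
    """Mean absolute error."""
    return np.mean(np.abs(np.asarray(actual) - np.asarray(predicted)))

def mse(actual, predicted):
    """Mean squared error."""
    return np.mean((np.asarray(actual) - np.asarray(predicted)) ** 2)

# Three small misses of 0.01 and one large miss of 0.20 (illustrative values).
actual = [0.10, 0.12, 0.11, 0.40]
predicted = [0.11, 0.11, 0.12, 0.20]

abs_err = np.abs(np.array(actual) - np.array(predicted))
print("MAE:", mae(actual, predicted))
print("MSE:", mse(actual, predicted))
print("outlier share of MAE:", abs_err[-1] / abs_err.sum())
print("outlier share of MSE:", abs_err[-1] ** 2 / np.sum(abs_err ** 2))
```

The single large miss accounts for a noticeably larger share of the MSE than of the MAE, which is the sense in which MAE is less sensitive to extreme observations.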
Based on the R² and mean absolute error (MAE), the Parkinson and Garman-Klass estimators generally perform slightly better than the other estimators, and their performance is close to each other. However, they can overestimate or underestimate volatility in extreme situations. In such cases, referring to other estimators may provide a more reliable and conservative approach. For example, the Parkinson and Garman-Klass estimators are more suitable for stable trading days, whereas the Yang-Zhang estimator performs better in handling opening gaps and overnight jumps.
4 Reference
18_close_to_close - Pritam Dalal
19_garch - Pritam Dalal
Measuring Historical Volatility - Colin Bennett, Miguel A. Gil
Volatility Modelling and Trading - Artur Sepp, Julius Baer